Closing the Gap: Improved Bounds on Optimal POMDP Solutions

Authors

Pascal Poupart

Kee-Eung Kim

Dongho Kim

Abstract

POMDP algorithms have made significant progress in recent years by allowing practitioners to find good solutions to increasingly large problems. Most approaches (including point-based and policy iteration techniques) operate by refining a lower bound on the optimal value function. Several approaches (e.g., HSVI2, SARSOP, grid-based approaches and online forward search) also refine an upper bound. However, approximating the optimal value function by an upper bound is computationally expensive, and therefore tightness is often sacrificed to improve efficiency (e.g., the sawtooth approximation). In this paper, we describe a new approach to efficiently compute tighter bounds by i) conducting a prioritized breadth-first search over the reachable beliefs, ii) propagating upper bound improvements with an augmented POMDP and iii) using exact linear programming (instead of the sawtooth approximation) for upper bound interpolation. As a result, we can represent the bounds more compactly and significantly reduce the gap between upper and lower bounds on several benchmark problems.
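For readers unfamiliar with the sawtooth approximation that the abstract contrasts with exact linear programming, the following is a minimal sketch of sawtooth interpolation as it is commonly formulated in the POMDP literature. It is not taken from the paper: the function name `sawtooth_upper_bound`, its parameters, and the toy numbers are all hypothetical, and the sketch assumes the upper bound is stored as one value per corner belief plus a list of interior (belief, value) pairs.

```python
from typing import List, Sequence, Tuple

Belief = Sequence[float]


def sawtooth_upper_bound(
    b: Belief,
    corner_values: Sequence[float],
    points: List[Tuple[Belief, float]],
) -> float:
    """Sawtooth interpolation of an upper bound at belief b (illustrative sketch).

    corner_values[s] is the upper bound at the corner belief for state s.
    points holds (belief, upper-bound value) pairs for interior beliefs.
    All names here are assumptions made for illustration.
    """
    # Baseline: linear interpolation using only the corner beliefs.
    base = sum(p * v for p, v in zip(b, corner_values))
    best = base
    for b_i, v_i in points:
        # Corner-only interpolation at the stored belief b_i.
        base_i = sum(p * v for p, v in zip(b_i, corner_values))
        # Largest weight c such that c * b_i <= b in every component.
        c = min(b[s] / b_i[s] for s in range(len(b)) if b_i[s] > 0.0)
        # Use one interior point at a time, with the corners covering the rest.
        best = min(best, base + c * (v_i - base_i))
    return best


# Toy example: two states, one interior point that lowers the bound.
corners = [10.0, 10.0]
stored = [([0.5, 0.5], 6.0)]
print(sawtooth_upper_bound([0.75, 0.25], corners, stored))
```

Because each candidate combines only one interior point with the corners, the result can be looser than the convex-hull interpolation that an exact linear program over all stored points would give, which is the trade-off the abstract refers to.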

Similar resources

Numerical Formulation on Crack Closing Effect In Buckling Analysis of Edge-Cracked Columns

In this paper, buckling of a simply supported column with an edge crack is investigated numerically and analytically. Four different scenarios of damage severity are applied to a column, and the open-crack assumption is compared numerically with the effect of crack closing on the stability of the column, which depends on the position and size of the cracks. Crack surface contact is modeled with GAP element using ...

α-min: A Compact Approximate Solver For Finite-Horizon POMDPs

In many POMDP applications in computational sustainability, it is important that the computed policy have a simple description, so that it can be easily interpreted by stakeholders and decision makers. One measure of simplicity for POMDP value functions is the number of α-vectors required to represent the value function. Existing POMDP methods seek to optimize the accuracy of the value function...

Cost minimization in multi-commodity multi-mode generalized networks with time windows

Cost Minimization in Multi-Commodity, Multi-Mode Generalized Networks with Time Windows. (December 2005) Ping-Shun Chen, B.S., National Chiao Tung University, Taiwan; M.S., National Chiao Tung University, Taiwan. Chair of Advisory Committee: Dr. Alberto Garcia-Diaz. The purpose of this research is to develop a heuristic algorithm to minimize total costs in multi-commodity, multi-mode generalized ...

A POMDP Framework to Find Optimal Inspection and Maintenance Policies via Availability and Profit Maximization for Manufacturing Systems

Maintenance can either increase or decrease a system's availability, so it is valuable to evaluate a maintenance policy from the cost and availability points of view simultaneously, according to the decision maker's priorities. This study proposes a Partially Observable Markov Decision Process (POMDP) framework for a partially observable and stochastically deteriorating syste...

Improved integrality gap upper bounds for TSP with distances one and two

We study the structure of solutions to linear programming formulations for the traveling salesperson problem (TSP). We perform a detailed analysis of the support of the subtour elimination linear programming relaxation, which leads to algorithms that find 2-matchings with few components in polynomial time. The number of components directly leads to integrality gap upper bounds for the TSP with ...

Publication date: 2011
